feat: Add CloudWatch logs subscription handling support to awslambdareceiver #44562
Conversation
Force-pushed 2ae6e5d to e9aa82a
axw left a comment:
Code looks fine, but I'd like to revisit the config. Implicitly making it CloudWatch-based because the S3 encoding isn't configured is a bit unintuitive.
Thanks @axw for this remark. Personally, I also found this misleading (detecting the CloudWatch Logs operation mode from the absence of an S3 encoder felt wrong). I am exploring a new strategy for this, but it will require considerable refactoring, and I think this is the right time to fix it. What I am considering is the following,
I am actively working on this improvement and will push it as a single commit to keep the change focused. But let me know if you think of a different approach.
How would that encoding extension be used for CloudWatch Logs? They have a specific structure: they must be decoded in the CloudWatch log subscription filter format. I think it would make sense to use it for decoding the message field, so e.g. you could use awslogs_encoding to decode CloudTrail logs sent to CloudWatch. You could support both S3 and CloudWatch events in the one Lambda, but they would have to have the same encoding, which still adds a little bit of cognitive overhead. Another option would be to have two separate attribute groups, along these lines:

```yaml
receivers:
  awslambda:
    s3:
      encoding: ... # required for handling S3 events
    cloudwatch_logs:
      message_encoding: ... # optional; defaults to using the message field as the log record body
```

I personally think that's easier to reason about, since it keeps the two event types completely separate. Configuring the encoding for one event type has no impact on the other.
Reading your suggestions, I agree that keeping the configuration for s3 and cloudwatch_logs separate is the cleaner approach.
Yes, exactly. However, there is a significant technical challenge here regarding input format expectations: the awslogs encoding extension (e.g., for CloudTrail) expects a specific root-level structure that mirrors the original S3 file format.
I agree this implementation detail can be tackled in a follow-up PR, but it's important to note that supporting message_encoding might require the receiver to have some logic to "adapt" the stream (delimiter/wrapping) based on the target encoding type.
Signed-off-by: Kavindu Dodanduwa <[email protected]>

Force-pushed ed8908a to 93bb5dd, then 93bb5dd to fbd585f
@axw @MichaelKatsoulis thanks for the feedback. Let me clarify my view and let's see if we can come to a conclusion on this. First, let's break down the receiver operation modes. The Lambda receiver can operate on both logs & metrics. Each operation mode can then have sub-trigger modes. Currently, we support the following,
The detection of the trigger type is easy and is already done. Once we obtain the content, that's where we need the help of the encoding extension. So if the user is forwarding VPC flow logs to S3, then the user must use:

```yaml
receivers:
  awslambda:
    encoding_extension: awslogs_encoding
extensions:
  awslogs_encoding:
    format: vpcflow
    vpcflow:
      file_format: plain-text
```
If we accept the proposed change in this PR, then the user must define:

```yaml
receivers:
  awslambda:
    encoding_extension: awslogs_encoding
extensions:
  awslogs_encoding:
    format: cloudwatch
```

However, this is where we sort of misuse the CloudWatch subscription filter. In the implementation (note 2), we are just reading the content of the message and setting it as the scope log's log record. In other words, if we applied the same usage as the S3 event trigger, we would be performing the content download of the S3 object through an encoding extension. Because of this we have the encoder-chaining need, and the complexity that comes with it is explained by @MichaelKatsoulis at #44562 (comment).

Improving the CloudWatch subscription trigger mode

As the next step, I have a suggestion: for the CloudWatch trigger we must move message extraction into the Lambda receiver. Once we have the message content, we can choose between two paths,

1. Decode the message further using the defined encoding extension.
2. Use the message content directly as the log record body.

The first path fixes the encoding-chaining issue, and we can use improved encoding extensions (see note 3) to decode CW logs. The second path exactly matches what we have now. Defining this operation mode needs some refactoring and can be specific to the CW event type (ex: ...).

@axw @MichaelKatsoulis let me know your thoughts. Also check the current state: commit fbd585f contains changes towards this direction.

Footnotes
Description
This PR adds CloudWatch logs Lambda subscription handling support to awslambdareceiver. With this change, the receiver can now be deployed as a Lambda and handle either of these event sources: S3 triggers or CloudWatch subscription logs.

Note: the receiver can only handle one event source type per deployment.
Link to tracking issue
Completes part of #43504
Testing
I have added unit tests to cover most of the new functionality. One of them (TestHandleCloudwatchLogEvent) provides end-to-end validation of a CloudWatch event to plogs conversion.

Documentation
Existing documentation already covered some aspects; however, I have attempted to improve it along with code-level comments.